1 Target Issue: Complex problem solving in very distant spatial tasks.

Experienced  submarine  Approach  Officers  (AO)  have  developed  complex  skills  for 
mentally  turning  the  alphanumerics  of  passive  sonar  into  spatial  representations  of  other  vessels, 
their  paths,  intentions,  and  the  high  uncertainty  of  the  undersea  world  (1997).  How  they  make 
this  translation  and  what  is  contained  in  the  spatial  representations  remains  unclear.  The 
translation  is  one  of  the  most  difficult  tasks  for  submariners  to  learn,  and  once  mastered,  it  is  still 
a  skill  that  submariners  find  critical,  but  complex  and  subject  to  error  (Kirschenbaum,  personal 
communication). 

Not  all  kinds  of  space  are  psychologically  the  same.  Developmental,  neuropsychological, 
and adult behavioral data suggest that different sizes of 3-dimensional space are processed in
very  different  ways  (Huttenlocher,  Newcombe,  &  Sandberg,  1994;  Previc,  1998;  Weatherford, 
1982).  For  example,  Previc  (1998)  distinguishes  four  major  categories  of  spatial  sizes: 
peripersonal  (visuomotor  operations  in  near-body  space),  focal  extrapersonal  (visual  search  and 
object recognition), action extrapersonal (orienting in topographically defined space), and
ambient  extrapersonal  (orienting  in  earth-fixed  space).  AOs  make  use  of  all  of  these  categories  of 
space,  and  thus  accounts  of  the  complex  problem  solving  in  which  AOs  engage  must  take  into 
account  the  properties  of  each  of  the  spatial  grain-size  activities.  However,  it  is  currently  unclear 
which  representation  categories  are  important  for  a  given  task.  It  is  also  currently  unclear  what 
psychological  mechanisms  and  representations  implement  each  category. 

The goal of this literature review is to evaluate our current understanding of how spatial
information is represented, such that this understanding can guide the development of
computational models for complex spatial problem solving such as submarine target motion
analysis.

2  Framework  &  Central  Questions 

Before covering psychological findings regarding representations of human visual space,
I first present a general framework in the form of common distinctions and central questions
about representations of human visual space.

2.1  Egocentric  vs.  Exocentric  Frames  of  Reference 

One of the most basic questions about space is the frame of reference, or coordinate
system, relative to which all objects and points are defined. Two primary frames of
reference are generally distinguished and go by the terms egocentric and exocentric. Egocentric
frames of reference use the observer as the center of reference, with the observer's orientation as the
referential axis of orientation. Exocentric frames of reference use some other point or some
other axis of orientation defined outside of the observer as the frame of reference (e.g., the earth
or the main axis of a room).

One  can  always  convert  from  a  location  in  one  frame  of  reference  to  a  location  in  another 
frame  of  reference  (e.g.,  from  ego  to  exocentric).  But,  this  distinction  is  important  because  the 
two  types  of  frames  of  reference  appear  to  have  different  pieces  of  primitive  information  which 
are  automatically  and  directly  represented  (Klatzky,  1998),  which  can  have  a  large  influence  on 
performance. 
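
The claim that a location can always be converted between frames can be made concrete with a minimal sketch. It assumes a flat, 2-dimensional space, headings measured clockwise from north, and egocentric locations expressed as a bearing relative to the observer's heading plus a distance; the function names and conventions are illustrative assumptions, not part of any model reviewed here.

```python
import math


def ego_to_exo(observer_xy, heading_deg, bearing_deg, distance):
    """Convert an egocentric (bearing, distance) pair to exocentric (x, y).

    Assumed conventions: x is east, y is north, heading is clockwise from
    north, and bearing is clockwise from the observer's heading.
    """
    absolute = math.radians(heading_deg + bearing_deg)
    x = observer_xy[0] + distance * math.sin(absolute)
    y = observer_xy[1] + distance * math.cos(absolute)
    return (x, y)


def exo_to_ego(observer_xy, heading_deg, target_xy):
    """Convert an exocentric (x, y) location to egocentric (bearing, distance)."""
    dx = target_xy[0] - observer_xy[0]
    dy = target_xy[1] - observer_xy[1]
    distance = math.hypot(dx, dy)
    absolute_deg = math.degrees(math.atan2(dx, dy))  # clockwise from north
    # Wrap the relative bearing into the range [-180, 180).
    bearing_deg = (absolute_deg - heading_deg + 180.0) % 360.0 - 180.0
    return (bearing_deg, distance)
```

The conversion itself is cheap arithmetic, but it needs the observer's own position and heading as inputs. That is one way to see why the choice of frame matters: the information that is directly available in one frame must be computed in the other.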
2.2 Types of Spatial Knowledge

In  representing  space,  three  main  types  of  information  can  be  represented.  The  first  is 
landmark  information.  Landmarks  are  familiar  objects  that  serve  as  isolated  reference  points  in 
our  spatial  knowledge.  These  landmarks  can  be  used  in  conjunction  with  egocentric  frames  of 
reference  (e.g.,  turn  left  when  you  get  to  a  fork  in  the  road)  or  with  exocentric  frames  of 
reference  (e.g.,  turn  south  when  you  get  to  a  fork  in  the  road). 

The  second  type  of  knowledge  is  route  knowledge.  Routes  are  paths  through  a  space  and 
include  some  topological  information  and  some  metric  information.  Route  knowledge  is  likely  to 
have  declarative  components  (direct  representations  of  space)  and  procedural  components 
(knowledge  of  what  to  do  in  different  regions).  Routes  can  also  be  defined  either  in  ego  or 
exocentric  terms  (Hunt  &  Waller,  1999).  Use  of  local,  exocentric  route  cues  is  called  tracking 
(e.g.,  following  highway  signs).  Use  of  egocentric  turns  and  distances  is  called  dead  reckoning 
(e.g.,  go  3  paces  straight  ahead).  Integration  of  egocentric  bearings  with  exocentric  locations  is 
called  piloting  (e.g.,  go  1  mile  straight  past  the  2nd  light). 
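
As an illustration of dead reckoning, the sketch below accumulates a sequence of egocentric turns and distances into an exocentric position. It assumes a flat plane, headings measured clockwise from north, and error-free turns and distances; the function name and the (turn, distance) move format are hypothetical.

```python
import math


def dead_reckon(start_xy, start_heading_deg, moves):
    """Integrate egocentric moves into an exocentric position and heading.

    moves is a sequence of (turn_deg, distance) pairs. Each turn is
    clockwise-positive and is applied before the distance is travelled.
    """
    x, y = start_xy
    heading = start_heading_deg
    for turn_deg, distance in moves:
        heading = (heading + turn_deg) % 360.0
        x += distance * math.sin(math.radians(heading))  # east component
        y += distance * math.cos(math.radians(heading))  # north component
    return (x, y), heading
```

For example, starting at the origin facing north, `dead_reckon((0.0, 0.0), 0.0, [(0.0, 3.0), (90.0, 4.0)])` walks 3 units north, turns right, and walks 4 units east. Because no landmark is consulted along the way, any error in a turn or distance carries forward into every later position estimate.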

The  third  type  of  knowledge  is  configuration  information  and  consists  of  a  global,  metric 
map  of  a  space.  Configuration  knowledge  can  also  be  egocentric  or  exocentric.  Configuration 
knowledge  is  the  most  complete  representation  of  a  space  and  is  usually  the  most  difficult  to 
maintain (Lindberg & Garling, 1983) and last to develop (Foley & Cohen, 1984; Siegel & White,
1975). 

2.3  Symbols  versus  images 

A long-standing debate in cognitive science surrounds the format of internal
representations: are they represented symbolically or perceptually? Some researchers argued that all
representations  are  inherently  symbolic  whether  perceptual-like  or  not  (Vera  &  Simon,  1993). 
Others  argued  that  the  two  are  informationally  equivalent  and  that  we  will  never  be  able  to 
distinguish  the  two  empirically  (Pylyshyn,  1989).  However,  neuropsychological  evidence 
suggested  that  many  representations  are  indeed  perceptual  (Kosslyn,  1990).  More  recently, 
Barsalou  (1999)  has  argued  that  even  abstract,  conceptual  entities  may  have  a  perceptual 
representation. 

Very  much  related  to  this  issue  is  the  type  of  scale  included  in  a  representation  of  space. 
Four  different  scales  are  usually  distinguished:  nominal,  ordinal,  interval,  and  ratio.  Nominal 
scales  represent  entities  in  terms  of  unordered  categories  (e.g.,  left/right  or  locations  A,  B,  or  C). 
Ordinal  scales  represent  entities  in  terms  of  ordered  categories  (e.g.,  left/middle/right  or 
here/close-by/faraway  or  above/my-level/below).  Interval  scales  represent  entities  in  terms  of 
metric  amounts  (i.e.,  a  given  difference  has  the  same  meaning  across  the  scale),  but  there  is  no 
real  zero  point  (i.e.,  the  ratio  of  absolute  locations  has  no  meaning).  One  might  argue  that 
exocentric  coordinates  involve  essentially  interval  scales.  By  contrast,  ratio  scales  represent 
entities  in  metric  terms  with  a  meaningful  zero  (e.g.,  distance  in  meters  of  an  object  relative  to 
the  observer). 

2.4  One  or  Many  Representations 

A  central  question  in  studies  of  visual  space  is  whether  there  are  one  or  many 
representations.  This  question  decomposes  into  questions  about  whether  there  are  many  different 
formats,  different  processes,  and  different  locations  in  the  brain.  This  issue  is  of  particular 
relevance  to  this  proposal  because  it  turns  out  that  there  are  indeed  many  different 
representations  and  they  appear  to  vary  with  the  scale  of  the  space  being  represented.  Since 
complex spatial tasks like the one described at the beginning of the background section involve
interacting  with  space  at  multiple  scales,  it  becomes  especially  important  to  consider  multiple, 
different  spatial  representations. 

3.  Properties  of  Human  Visual  Space 

Visual  space  as  perceived  and  retrieved  from  memory  by  humans  is  not  a  simple 
reconstruction  of  Euclidean  space.  First,  the  deviations  from  Euclidean  space  are  systematic  and 
fall  into  several  different  categories.  Second,  it  appears  that  there  are  several  different 
representations of space that people use. Third, it appears that the representations within an
individual fluctuate over time. This section describes each of these results and the empirical
evidence from cognitive, neuro-, mathematical, and developmental psychology supporting
them.

The developmental literature is applied to the cognition of adults in the following way.
When two abilities develop at different ages, two things are assumed to be likely.
First, the later development is likely to indicate different underlying processes,
or the same processes working on different representations. Second, functions involved in
later-occurring developments are likely to continue to be processed less readily in adults
than those associated with early-occurring developments, for reasons of task expertise and
sophistication of the required representation. Abilities that develop later will, by definition,
have had fewer opportunities to be practiced and are thus likely to be less automated.
Functions that involve the same processes but more sophisticated
representations will also continue to be processed more slowly.

3.1  Biases  in  Representation 

3.1.1.  Curvature  in  the  horizontal  plane 

The  visual  space  that  we  perceive  has  systematic  curvature  in  the  horizontal  plane  that  is 
hyperbolic in form (Indow, 1991, 1997; Luneburg, 1947, 1950). This hyperbolic bending of
space  has  two  features.  First,  visual  space  that  is  relatively  near  to  us  appears  concave  (bent 
towards  us)  and  space  that  is  relatively  far  from  us  appears  convex  (bent  away  from  us).  Second, 
lines  that  appear  to  be  straight  out  into  the  distance  are  actually  bent  outwards.  This  hyperbolic 
bending  of  space  applies  only  to  horizontal  planes  extending  out  from  our  eye-level  (either  flat  or 
tilted).  Frontoparallel  planes  of  space  (i.e.,  planes  in  front  of  us  parallel  to  our  bodies)  appear  to 
have  regular  Euclidean  geometry  (or  at  least  do  not  have  this  type  of  curvature). 
Figure 1. Series of points on a flat horizontal plane extending from the observer that appear to be
in  parallel  lines  (filled  circles)  or  equidistant  points  (open  circles).  Filled  in  squares  are  the  fixed 
points  used  in  horopter  experiments,  and  the  horizontal  curves  extending  through  them  represent 
H-curves.  Vertical  curves  are  theoretical  curves  representing  a  hyperbolic  model  of  visual  space 
fit  to  the  data. 

The  first  piece  of  evidence  for  the  hyperbolic  bending  of  space  comes  from  alley 
experiments.  In  alley  experiments,  participants  make  decisions  about  pairs  of  points  in  a 
frameless  visual  space  that  is  achieved  either  through  points  of  light  in  a  dark  room  or  small 
objects  in  an  evenly  illuminated  surface  with  invisible  edges.  Participants  adjust  the  points 
according  to  two  different  criteria  to  form  P-alleys  and  D-alleys.  For  P-alleys,  participants  are 
shown  pairs  of  points  extending  along  the  y-axis  (away  from  the  participant)  and  are  asked  to 
adjust  them  along  the  x-axis  (left-right  axis)  until  they  appear  to  form  straight  and  parallel  lines. 
For  D-alleys,  participants  are  again  shown  pairs  of  points  extending  in  the  y-axis  and  are  asked 
to  adjust  them  in  the  x-axis  such  that  each  pair  of  points  has  the  same  lateral  separation  as  that  of 
a  fixed,  reference  pair.  Although  the  two  types  of  alleys  produce  slightly  different  results 
(Blumenfeld,  1913;  Indow,  1991),  with  P-alleys  generally  lying  inside  D-alleys,  both  sets  of 
alleys  diverge  out  into  the  distance,  consistent  with  a  hyperbolic  bending  of  space. 

The  second  piece  of  evidence  for  the  hyperbolic  bending  of  visual  space  comes  from 
horopter  experiments  (Luneburg,  1950).  In  these  experiments,  participants  again  make 
judgments  in  a  frameless  visual  space.  Participants  adjust  series  of  points  along  the  y-axis  such 
that  they  appear  to  make  a  straight  line  that  runs  left-to-right  in  parallel  to  the  forehead  of  the 
participant.  Each  series  consists  of  three  points,  with  the  center  point  being  fixed  in  space.  The 
curve  through  these  three  points  is  called  an  H-curve.  This  process  is  repeated  for  series  of  points 
at  different  y-distances  from  the  observer  (see  Figure  1).  These  experiments  find  that  the 
participants' judgements are systematically biased such that the H-curves are convex at distances
far from the observer and concave at distances close to the observer. The crossover point is
somewhere  around  2  meters  away  from  the  observer  (Indow,  1991),  although  the  exact  crossover 
point  varies  across  individuals  (Luneburg,  1950). 
The exact form of the bending is an issue of debate, for example whether the (Gaussian)
curvature is constant across all points in space (Eschenburg, 1980; Indow, 1991). The degree of
bending has been observed to depend upon the nature of the stimuli. For example, illuminated
spaces produce different bending than dark spaces (Indow, 1997; Indow & Watanabe, 1984).
Moreover,  more  bending  was  found  in  experiments  performed  in  large  spaces  (e.g.,  a 
gymnasium)  than  for  experiments  performed  in  small  laboratory  spaces  (Indow,  1991). 

3.1.2.  Two-dimensional  bias  towards  anchor  points 

When  visual  space  is  examined  in  the  context  of  other  objects  (in  contrast  to  frameless 
visual  space)  additional  biases  appear.  For  example,  when  people  must  reproduce  the  location  of 
a point in two-dimensional space, their estimates for the location of that point are systematically
biased  towards  visual  anchor  points.  In  a  series  of  experiments,  Huttenlocher,  Hedges,  and 
Duncan  (1991)  found  that  when  asked  to  reproduce  the  location  of  a  dot  within  a  circle  people 
show  a  consistent  pattern  of  bias.  In  particular,  the  estimates  for  dot  locations  are  biased  toward 
the  center  of  mass  within  each  quadrant  of  the  circle  (see  Figure  2). 


Figure  2.  Schematic  representation  of  the  bias  in  reports  of  location  in  a  circle. 

To  account  for  these  results,  Huttenlocher  et  al.  postulated  that  memory  consists  of 
unbiased,  but  noisy  fine-grain  coding  of  information  together  with  gross-grain  categorical 
information  that  is  combined  to  produce  the  biased  estimates.  Subsequent  developmental  studies 
supported  this  hierarchical  decomposition  of  spatial  knowledge.  Huttenlocher,  Newcombe,  and 
Sandberg (1994) found that 16-month-olds treat objects as wholes without subparts when coding
spatial  location  (i.e.,  show  no  systematic  biases  towards  anchor  points  within  the  object).  They 
also  found  that  children  subdivide  objects  of  increasing  complexity  as  they  get  older.  For 
example,  a  rectangle  is  subdivided  by  4-year-olds,  and  a  sandbox  is  subdivided  by  10-year-olds 
(i.e., show systematic biases towards anchor points within the objects). Follow-up work by
Sandberg,  Huttenlocher,  and  Newcombe  (1996)  suggests  that  the  simplified  representations  used 
by  younger  children  are  likely  to  be  due  to  cognitive  load  rather  than  difficulties  in  representing 
particular  kinds  of  information  per  se. 
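
One simple way to read the combination of fine-grain and categorical information is as a weighted average of the remembered location and a category prototype. The sketch below is an illustration under that assumption only: the weight value is arbitrary, memory noise is omitted, and the prototype is placed at the geometric centroid of the quarter disc, which lies 4R/(3*pi) from the circle's center along each axis. The function names are hypothetical.

```python
import math


def quadrant_centroid(x, y, radius=1.0):
    """Centroid of the quadrant of a circle (centered at the origin) containing (x, y).

    Points lying exactly on an axis are assigned to the positive side,
    which is an arbitrary choice made for this sketch.
    """
    c = 4.0 * radius / (3.0 * math.pi)
    return (math.copysign(c, x), math.copysign(c, y))


def biased_report(fine_grain_xy, fine_weight=0.8, radius=1.0):
    """Blend a fine-grain location with its quadrant prototype.

    fine_weight = 1.0 reproduces the fine-grain value exactly; smaller
    values pull the report toward the prototype.
    """
    if not 0.0 <= fine_weight <= 1.0:
        raise ValueError("fine_weight must lie between 0 and 1")
    prototype = quadrant_centroid(fine_grain_xy[0], fine_grain_xy[1], radius)
    return tuple(
        fine_weight * f + (1.0 - fine_weight) * p
        for f, p in zip(fine_grain_xy, prototype)
    )
```

Under this reading, a dot near the edge of a quadrant is reported closer to the quadrant's interior than it really was. Lowering the weight on the fine-grain value (for instance, when that value is noisier) increases the bias.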

3.1.3.  Differential  treatment  of  geometrical  and  non-geometrical  landmarks 
In locating objects and locations in the environment, people use two different types of
landmarks.  First,  they  use  large-scale  geometrical  landmarks  such  as  the  shape  of  the  terrain  (as 
determined  by  mountains,  valleys,  or  nearby  buildings)  in  outdoor  contexts,  and  the  overall 
geometric  shape  of  the  room  in  indoor  contexts.  Second,  they  use  non-geometrical  landmarks 
such  as  the  location  of  the  sun,  the  direction  of  the  wind,  the  color  of  a  wall,  the  patterning  of  a 
wall, or the categorical identity of objects in the environment. In re-orienting oneself in an
environment,  the  individual  combines  these  two  types  of  information. 

Developmental  and  comparative  psychological  research  suggests  that  these  two  types  of 
information  are  represented  differently  and  develop  at  different  times.  For  example,  Hermer  and 
Spelke  (1994,  1996)  found  that  young  children  fail  to  use  nongeometric  landmarks  in  reorienting 
themselves in a room, whereas adults do use such nongeometric information. Adult rats also
have such difficulty in using nongeometric landmarks (Margules & Gallistel, 1988). However, Hermer and
Spelke (1996) found that the young children were able to detect, remember, and use the same
nongeometric information in other tasks. They argue that these results suggest that the problem is not one of
representation  but  rather  one  of  information  encapsulation:  the  information  is  encoded  so 
differently  that  more  sophisticated  processes  are  required  to  integrate  the  two  types  of 
information  to  be  used  in  spatial  re-orientation. 

3.1.4.  Influence  of  perceived  vs.  imagined  reality  conflicts  on  spatial  processing 

Some  spatial  reasoning  tasks  require  that  we  imagine  spatial  situations  other  than  those  in 
front  of  us.  The  developmental  literature  suggests  such  a  task  is  especially  difficult  when  the 
currently  perceived  reality  clashes  with  the  to-be-imagined  reality. 

Most  of  the  developmental  psychology  literature  on  spatial  reasoning  has  focused  on  the 
perspective-taking  problem,  first  addressed  by  Piaget.  The  common  finding  is  that  viewer 
rotation  is  harder  than  array  rotation,  and  children  below  9  or  10  cannot  solve  viewer  rotation 
problems.  That  is,  they  cannot  answer  the  question  "what  would  the  arrangement  of  objects  look 
like  from  the  other  side  of  the  table?"  but  they  can  answer  "what  would  the  arrangement  of 
objects  look  like  if  the  table  were  rotated?"  Piaget  and  followers  claimed  that  this  result  is  due  to 
egocentrism  (Flavell,  1968;  Piaget  &  Inhelder,  1966).  In  other  words,  the  claim  is  that  children 
below  a  certain  age  cannot  imagine  a  perspective  other  than  their  own.  However,  later  work  by 
Presson (1982) showed that even adult humans process the two types of situations
differently. 

Huttenlocher  and  Presson  (1979)  showed  that  this  difficulty  in  viewer  rotation  is  not  due 
to  egocentrism  and  the  effect  can  be  reversed.  For  example,  they  found  that  eight-year-olds  could 
solve  problems  asking  them  to  imagine  the  relative  location  of  an  item  with  respect  to  a 
hypothetical  observer  (in  contrast  to  the  traditional  question  asking  about  the  relative  location  of 
items  to  each  other  from  the  perspective  of  a  hypothetical  observer).  Newcombe  and 
Huttenlocher  (1992)  extended  these  results  to  much  younger  children  and  argued  that  the 
difficulties in viewer rotation problems are due to conflicts between the currently perceived reality
and the reality that must be imagined to solve the task. That is, if the two are too similar (requiring
similar  objects  and  type  of  relational  encoding),  the  perceived  reality  intrudes  on  the  imagined 
one. 

3.1.5.  Non-equivalence  among  spatial  directions 

When searching through imagined 3-dimensional environments, not all spatial directions
are  searched  equally  well.  In  particular,  the  left/right  direction  appears  to  be  searched  more 
slowly  than  front/back  or  up/down  directions.  Franklin  and  Tversky  (1990)  had  participants 
answer questions about objects placed in an imagined environment in various directions relative to
the participant. Across a variety of experimental manipulations, the left/right dimension was
always searched more slowly than the other dimensions. Note that left and right refers to objects
directly  to  the  side  of  the  observer  out  of  the  field  of  view,  not  to  objects  in  front  of  the  observer 
slightly  to  the  left  and  right  that  would  be  searched  quite  quickly. 

A  similar  slowing  of  the  left/right  dimension  was  found  by  Hintzman,  O'Dell,  and  Arndt 
(1981)  in  a  series  of  (2-dimensional)  experiments  in  which  participants  had  to  point  to  targets 
while  imagining  themselves  in  a  particular  spot  facing  in  various  directions.  Newcombe  and 
Huttenlocher  (1992)  also  found  that  left/right  is  harder  than  near/far  for  children — a  sticker  was 
placed  on  one  of  the  children's  hands  to  make  sure  the  problem  was  not  one  of  knowing  "left" 
and  "right"  labels. 

Franklin  and  Tversky  interpreted  their  results  in  terms  of  a  spatial  framework  model  in 
which  the  up/down  and  front/back  dimensions  are  more  functionally  important  in  the 
environment.  They  also  argued  that  their  results  ruled  out  a  spatial  transformation  account  in 
which  the  participant  mentally  rotated  in  their  environment  because  left/right  questions  (i.e.,  at 
90  degrees)  were  answered  significantly  more  slowly  than  questions  about  what  was  behind  (i.e., 
at  180  degrees). 

3.2  Multiple  Representations  of  Space 

3.2.1  Neuroscience  division  of  multiple  representations 

From  a  purely  neuroscience  perspective,  human  visual  space  is  incredibly  complex.  At 
the  process  level,  vision  is  a  very  complex  process  by  which  information  is  combined  to 
understand (perceive and recognize) the 3-dimensional world from essentially 2-dimensional
information  on  the  retina.  More  importantly  to  this  proposal,  neuroscientists  have  long  called  for 
multiple  spatial  representations  based  on  dissociations  in  localized  human  brain  damage,  lesion 
studies  with  animals,  single-cell  recording  with  animals,  and,  more  recently,  brain  imaging 
studies  with  humans.  A  variety  of  proposals  have  been  put  forth  over  the  years  for  the  number  of 
representations  and  their  content  (Brain,  1941;  Cutting  &  Vishton,  1995;  Grusser,  1983; 
Mountcastle,  1976;  Pettigrew  &  Dreher,  1987;  Previc,  1990,  1998;  Rizzolatti,  Gentilucci,  & 
Matelli,  1985;  Rizzolatti,  Matelli,  &  Pavesi,  1983). 

The  most  comprehensive  model  is  that  of  Previc  (1998).  This  model  calls  for  4  different 
representations  of  space.  Each  representation  corresponds  to  different  spatial  domains,  going 
from  very  close  to  the  individual  to  very  far  away,  with  differential  degrees  of  emphasis  towards 
upper  and  lower  visual  fields  and  degree  of  coverage  around  the  body.  Each  representation 
involves  different  types  of  activities/functions  of  space.  Each  representation  has  different 
functional  properties.  Finally,  each  representation  has  different  anatomical  localizations  in  the 
brain.  Here  I  will  review  the  characteristics  of  each  representation.  See  Previc  (1998)  for  a 
detailed  review  of  the  neuroscience  evidence  supporting  these  distinctions.  Although  the 
different  spaces  also  involve  senses  beyond  vision  to  different  degrees,  I  focus  on  the 
relationship to vision here because that is the focus of my proposed research.

The  first  representation  is  of  peripersonal  space  and  involves  visuomotor  operations  in 
near-body  space  (e.g.,  visual  grasping  and  manipulation).  It  extends  between  0  and  2  meters  from 
the  body,  in  a  60-degree  arc  in  front  of  the  body,  with  a  bias  towards  the  lower  visual  field.  It 
appears  to  use  egocentric  coordinates,  primarily  body-centered  (Gaffron,  1958;  Previc,  1990).  It 
appears  to  be  represented  primarily  in  dorsolateral  cortex. 
The second representation is of focal extrapersonal space and is involved in visual search
and object/face recognition. It extends from 0.2 meters from the body into the distance, in a
25-degree arc in front of the body, with a bias towards the upper visual field. It appears to use
egocentric  coordinates,  most  likely  using  retinotopic  coordinates  (Deneve  &  Pouget,  1998;  Farah 
&  Buxbaum,  1997).  It  appears  to  be  represented  primarily  in  ventrolateral  cortex. 

The  third  representation  is  of  action  extrapersonal  space  and  is  involved  in  navigation  (in 
relation  to  objects  and  topographically  defined  space),  scene  memory,  and  target  orientation.  It 
extends  from  2  meters  into  the  distance,  completely  around  the  individual  (although  with 
compression  outside  of  200  degrees),  with  a  bias  towards  the  upper  visual  field.  It  appears  to  use 
egocentric coordinates, particularly gaze-centered coordinates (Bisiach & Luzzatti, 1978; Gaffan,
1991;  Rolls  &  O'Mara,  1995).  It  appears  to  be  represented  primarily  in  the  ventromedial  cortex. 

The  fourth  representation  is  of  ambient  extrapersonal  space  and  is  involved  in  spatial 
orientation,  postural  control  during  locomotion,  and  stabilizing  perception  of  the  world  during 
locomotion. It covers the region more than 2 meters away from our bodies, in a 180-degree arc in front
of  us,  with  a  bias  towards  the  lower  visual  field.  It  appears  to  use  exocentric  coordinates  with  a 
bias  to  use  gravity  and  the  earth  as  a  frame  of  reference  (Angelaki  &  Hess,  1995;  Patterson  et  al., 
1997).  It  is  represented  primarily  in  the  dorsomedial  cortex. 
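
The four representations described above can be collected into a small lookup structure, which is the kind of table a computational model would need in order to decide which representations could hold a given object. The sketch below restates only the values given in the text. Treating the distance and arc limits as hard cutoffs, centering every arc straight ahead, and the class and function names are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class SpatialRepresentation:
    name: str
    function: str
    min_distance_m: float
    max_distance_m: float
    arc_deg: float  # total angular coverage, assumed centered straight ahead
    field_bias: str
    frame: str
    cortex: str


PREVIC_1998 = (
    SpatialRepresentation(
        "peripersonal", "visuomotor operations (grasping, manipulation)",
        0.0, 2.0, 60.0, "lower", "egocentric, body-centered", "dorsolateral"),
    SpatialRepresentation(
        "focal extrapersonal", "visual search, object/face recognition",
        0.2, math.inf, 25.0, "upper", "egocentric, retinotopic", "ventrolateral"),
    SpatialRepresentation(
        "action extrapersonal", "navigation, scene memory, target orientation",
        2.0, math.inf, 360.0, "upper", "egocentric, gaze-centered", "ventromedial"),
    SpatialRepresentation(
        "ambient extrapersonal", "spatial orientation, postural control",
        2.0, math.inf, 180.0, "lower", "exocentric, earth-fixed", "dorsomedial"),
)


def candidate_representations(distance_m, eccentricity_deg):
    """Names of the representations whose region contains the given point.

    eccentricity_deg is the angle away from straight ahead (0 to 180).
    """
    return [
        rep.name
        for rep in PREVIC_1998
        if rep.min_distance_m <= distance_m <= rep.max_distance_m
        and abs(eccentricity_deg) <= rep.arc_deg / 2.0
    ]
```

Even under these crude assumptions, most locations fall inside more than one region. An object 10 meters straight ahead is covered by the focal, action, and ambient representations at once, which illustrates why the question of which representation is active for a given task is not settled by location alone.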

The  neuroscience  literature  suggests  that  representations  have  clear  input/output 
connections. For example, the focal extrapersonal space is used for controlling saccades.
However, which kinds of spatial representations a person will have at any point in time is a
complex issue.

First,  tasks  often  require  multiple  inputs  or  multiple  outputs.  For  example,  searching  for  a 
door in a large room involves both focused visual search and overall spatial orientation. Thus,
one  simple  task  may  activate  multiple  representations. 

Second,  the  spatial  representations  are  used  in  a  coordinated  fashion.  For  example, 
overall spatial maps of the surrounding area are used to initialize focused visual search.
Similarly, in moving to a door and opening it, one first finds the door (visual search), walks to it
(navigation), updates one's location relative to the room (localization), and finally grabs the
handle  and  turns  it  (eye/hand  coordination).  Information  in  one  representation  is  in  part  passed  to 
the  other  representations.  For  example,  the  identity  and  location  of  the  target  object  (e.g.,  the 
door  in  the  previous  example)  are  shared  across  representations. 

Third,  individuals  may  engage  in  re-coding  from  one  representation  to  another  even  if  the 
current  external  task  does  not  require  it.  This  possibility  is  especially  likely  in  complex  tasks  that 
require  significant  cognitive  activities  (e.g.,  planning).  Because  each  representation  uses  different 
coordinate  systems  and  has  different  biases,  an  individual  may  shift  representations  depending 
on  what  works  best  for  the  current  cognitive  task. 

The possibility of multiple simultaneous representations that must be coordinated at times,
with possible re-coding of information, makes prediction and analysis of behavior in complex
tasks very difficult from verbal theories of the tasks and the representation systems. To really
understand  how  these  representations  interact  so  that  predictions  can  be  made  for  particular 
situations,  one  must  build  a  computational  framework  that  instantiates  the  computational  and 
behavioral  properties  of  these  representations  in  a  very  precise  way. 

3.2.2  Computational  modeling  &  neuroscience  evidence 

From  a  computational  modeling  perspective,  there  is  a  real  temptation  to  take  one  of  two 
different  extreme  approaches  in  response  to  this  complexity.  The  first  extreme  approach  is  to 
completely  ignore  the  neuroscience  evidence  and  assume  a  single,  computationally  convenient 
way of representing spatial information. One might argue that ACT-R/PM has taken this
approach. There, visual space is represented in terms of objects at precise X-Y locations, without
any treatment of bias or orientation issues and without any treatment of a third dimension. The
advantage  of  this  extreme  approach  is  that  the  representation  process  is  easy  to  understand,  easy 
to  implement,  and  appears  to  work  fine  in  modeling  domains  involving  flat  displays  with 
relatively  little  spatial  reasoning.  Finally,  as  cognitive  science  has  long  known,  there  is  a  great 
deal  of  informational  equivalence  in  representations — with  a  little  bit  of  work  you  can  represent 
almost  anything  in  any  given  type  of  representation. 

The  second  extreme  approach  is  to  take  the  neuroscience  evidence  quite  literally  and 
build  a  separate  functional  box  corresponding  to  each  neurologically  distinct  region.  This  is  the 
approach  taken  by  connectionist  models  of  space  (Touretzky  &  Redish,  1995,  1996).  The 
advantage  of  this  approach  is  that  each  box  can  be  built  exactly  to  specification.  Thus,  this 
approach  appears  to  work  fine  in  modeling  very  simple,  representation-targeted  tasks — only  one 
functional  box  is  required  at  a  time  in  such  tasks. 

Both  of  these  extreme  approaches  have  their  perils.  The  first  extreme  is  clearly  going  to 
be  wrong  whenever  the  to-be-modeled  task  involves  careful  and  detailed  spatial  reasoning. 
While  there  is  some  informational  equivalence  across  representations,  there  is  not  computational 
equivalence.  In  other  words,  certain  representations  afford  certain  computations  (Larkin  & 
Simon,  1987). 
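
The distinction between informational and computational equivalence can be illustrated with a minimal sketch. The example is not drawn from the reviewed literature: the object names, coordinates, and helper functions are hypothetical. The same two locations are stored in Cartesian and in polar (range, bearing) form. Either form can be converted to the other without loss, but finding the nearest object is a direct comparison in the polar form and requires an extra distance computation for every object in the Cartesian form.

```python
import math

def to_polar(x, y):
    """Convert Cartesian (x, y) to (range, bearing in degrees from the +x axis)."""
    return math.hypot(x, y), math.degrees(math.atan2(y, x))

def to_cartesian(rng, bearing_deg):
    """Convert (range, bearing in degrees) back to Cartesian (x, y)."""
    theta = math.radians(bearing_deg)
    return rng * math.cos(theta), rng * math.sin(theta)

# Hypothetical object locations relative to an observer at the origin.
cartesian = {"A": (3.0, 4.0), "B": (-6.0, 8.0)}
polar = {name: to_polar(*xy) for name, xy in cartesian.items()}

def nearest_polar(polar_objs):
    """Range is stored explicitly, so the answer is a direct comparison."""
    return min(polar_objs, key=lambda name: polar_objs[name][0])

def nearest_cartesian(cart_objs):
    """Same answer, but each distance must first be computed from x and y."""
    return min(cart_objs, key=lambda name: math.hypot(*cart_objs[name]))
```

Both functions return the same object, which is the informational equivalence; the difference lies in how much work each representation requires to produce the answer.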

The  second  extreme  is  going  to  be  of  limited  use  in  modeling  complex  behavior,  for 
several  reasons.  First,  it  is  computationally  very  expensive  and  difficult  to  implement,  and  thus 
not  likely  to  be  used  in  the  kinds  of  situations  that  complex  spatial  tasks  involve.  Second,  it 
places  too  large  a  separation  between  representation  and  process.  Each  functional  box  tends  to 
have  its  own  processes.  This  approach  then  tends  to  miss  the  more  parsimonious  approach  of 
different  representations  with  a  common  set  of  processes  (with  common  learning  mechanisms 
and  common  performance  mechanisms).  Neuroscience  evidence  has  not  yet  ruled  out  this 
intermediate  case;  Occam's  razor  should  be  applied.  Indeed,  the  similarity  in  learning  patterns 
(e.g.,  the  ubiquitous  power  laws  of  learning  and  of  forgetting)  across  a  wide  variety  of  tasks  is 
suggestive  of  a  common  procedural  core.  Third,  there  is  little  understanding  of  how  the  separate 
functional  boxes  interact  to  produce  a  single  behavior  at  a  given  point  in  time,  especially  in  tasks 
that  require  using  multiple  boxes. 

For  these  reasons,  I  propose  to  use  a  variant  of  the  intermediate  case:  multiple, 
independent,  and  potentially  simultaneously  active  spatial  representations  with  different 
properties,  but  all  accessed  by  a  common  procedural  core  with  a  common  set  of  learning 
and  performance  mechanisms. 
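
As a rough illustration of this intermediate case, the sketch below keeps several spatial representations side by side and routes all access and all learning through one shared mechanism. It is an assumption-laden toy, not the proposed model or any existing architecture: the class names, the utility-based selection, and the delta-rule update are all hypothetical choices made only to show the structure.

```python
from dataclasses import dataclass, field

@dataclass
class Representation:
    """One spatial representation with its own content and properties."""
    name: str
    facts: dict = field(default_factory=dict)
    utility: float = 0.0  # learned estimate of how useful this representation has been

class ProceduralCore:
    """A single set of performance and learning mechanisms shared by all representations."""

    def __init__(self, representations, learning_rate=0.5):
        self.reps = {r.name: r for r in representations}
        self.learning_rate = learning_rate

    def query(self, key):
        """Performance: answer from the highest-utility representation that holds the key."""
        candidates = [r for r in self.reps.values() if key in r.facts]
        if not candidates:
            return None
        best = max(candidates, key=lambda r: r.utility)
        return best.name, best.facts[key]

    def learn(self, rep_name, reward):
        """Learning: the same update rule is applied to whichever representation was used."""
        rep = self.reps[rep_name]
        rep.utility += self.learning_rate * (reward - rep.utility)

# Two simultaneously active representations of the same (hypothetical) target.
egocentric = Representation("egocentric", {"target": "ahead-left"})
allocentric = Representation("allocentric", {"target": (12.0, 40.0)})
core = ProceduralCore([egocentric, allocentric])
```

Calling `core.learn("allocentric", 1.0)` after a successful use raises that representation's utility, so a later `core.query("target")` returns the allocentric content. The point is only that the representations differ while the procedures that select and tune them do not.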

3.2.3  Cognitive  psychology  of  multiple  representations 

Because  of  recoding,  people  can  build  many  possible  representations  no  matter  what  the 
input,  although  there  are  likely  to  be  biases  in  choice  of  representation  given  a  particular  input.  For 
example,  people  can  develop  cognitive  maps  from  moving  fingers  over  a  map  while  blindfolded, 
walking  a  path  blindfolded,  or  viewing  a  map  (i.e.,  similar  representations  from  different  input 
modalities)  (Levine,  Jankovic,  &  Palij,  1982). 

In  addition,  a  given  task  can  result  in  multiple,  simultaneous  representations,  both 
because  multiple  representations  are  required  by  the  task  (e.g.,  reading  requires  building 
structural,  phonemic,  and  semantic  representations)  and  because  of  incidental,  automatic  processing 
(e.g.,  semantics  are  often  processed  during  non-semantic  tasks  like  lexical  decision).  Therefore, 
it  is  reasonable  to  build  a  model  that  creates  multiple  layers  of  representation  and  accesses  them 
opportunistically. 

Not  all  representations  are  equally  valuable  for  spatial  problem  solving.  For  example, 
Hegarty  and  Kozhevnikov  (1999)  found  that  schematic  representations  but  not  pictorial 
representations  were  positively  correlated  with  solution  success  in  mathematical  problem 
solving.  Therefore,  it  is  important  to  examine  the  role  of  each  type  of  representation  in  problem 
solving  success. 

3.3  Variability  in  Content  of  Representations 

Even  assuming  a  fixed  kind  of  spatial  representation  format  with  a  given  set  of  perceptual 
inputs  and  fixed  coordinate  system,  there  is  still  likely  to  be  considerable  variability  in  the 
content  of  the  representations.  This  variability  can  have  a  large  impact  on  reasoning  that  uses  this 
content,  and  thus  is  an  important,  although  often  overlooked,  aspect  of  representations. 

3.3.1  Uncertainty 

The  first  layer  of  variability  stems  from  uncertainty  of  two  different  types:  perceptual  and 
conceptual.  Perceptual  uncertainty  stems  from  the  inexact  magnitude  estimates  that  the  human 
perceptual  system  produces  (e.g.,  in  estimating  distances  of  objects,  their  magnitudes,  their 
speed,  etc.).  Here  representations  can  change  simply  because  perceptual  estimates  of 
magnitude  change  noticeably  over  time.  For  example,  one  might  revise  one's  estimate  of  the 
distance  of  an  object  several  times  over  a  relatively  short  period. 

Conceptual  uncertainty  stems  from  an  understanding  of  the  misleading  or  noisy  nature  of 
various  perceptual  inputs.  For  example,  in  reading  sonar  location  readings,  a  submarine  approach 
officer  realizes  that  the  apparent  depression  angle  of  the  target  (i.e.,  the  apparent  depth  of  the 
source  of  the  sounds)  may  be  completely  different  from  the  actual  depression  angle  of  the 
target  because  of  the  way  sound  bounces  off  the  ground  and  is  bent  through  layers  in  the  ocean. 
Gestures  made  by  the  approach  officers  reveal  that  they  are  directly  representing  this 
uncertainty:  when  they  discuss  objects  whose  location  is  still  open  to  interpretation,  they  use 
vacillating  gestures  in  representing  the  object  (e.g.,  a  quivering  hand)  to  indicate  uncertainty 
about  the  object's  location. 

3.3.2  Depth  of  information 

The  second  layer  of  variability  in  representations  of  a  fixed  format  stems  from  different 
kinds  of  dimensional  reduction  that  people  use.  Work  by  Shah  and  Carpenter  (1995)  suggests 
that  people  can  vary  quite  significantly  in  the  type  of  scale  they  use  to  represent  a  given 
dimension.  That  is,  even  when  people  are  given  perceptual  input  that  may  be  interpreted  as  a 
ratio  scale,  people  may  sometimes  re-represent  that  information  in  ordinal,  or  even  nominal 
terms.  This  reduction  in  dimension  quality  may  be  in  response  to  the  perceptual  and  conceptual 
uncertainties  in  the  situation. 

Applied  to  the  submarine  domain,  one  sees  that  approach  officers  often  use  nominal 
representations  to  represent  depth  in  the  water.  For  example,  they  think  of  objects  as  being  either 
above  the  water,  on  the  water,  somewhere  above  them,  at  the  same  level,  or  somewhere  below 
them.  Similarly,  they  sometimes  think  of  the  x-axis  (in  the  fronto-parallel  plane)  as  being 
ordinal:  to  the  left,  in  the  center,  or  to  the  right. 
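
The kind of scale reduction described above can be sketched as a pair of mappings from ratio-scale input to a handful of categories. This is only an illustration: the function names, the sign conventions (depth positive downward, relative bearing negative to the left), and the tolerance values are assumptions, not values reported by approach officers or in the cited work.

```python
def depth_category(target_depth_m, own_depth_m, tolerance_m=10.0):
    """Reduce a ratio-scale depth (meters, positive downward) to one of five categories.

    Negative depth stands for altitude above the surface; tolerance_m is an
    assumed band within which the target counts as being at the same level.
    """
    if target_depth_m < 0:
        return "above the water"
    if target_depth_m == 0:
        return "on the water"
    diff = target_depth_m - own_depth_m
    if abs(diff) <= tolerance_m:
        return "at the same level"
    return "somewhere above" if diff < 0 else "somewhere below"

def lateral_category(relative_bearing_deg, center_halfwidth_deg=10.0):
    """Reduce a relative bearing (degrees, negative = left) to left / center / right."""
    if relative_bearing_deg < -center_halfwidth_deg:
        return "to the left"
    if relative_bearing_deg > center_halfwidth_deg:
        return "to the right"
    return "in the center"
```

For example, with own ship at 50 m, a contact at 200 m is simply "somewhere below"; how far below is discarded, which is the information lost in the reduction.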

3.3.3  Variability  in  represented  objects  and  features  over  time 

The  third  layer  of  representational  variability  is  variability  in  represented  objects  and 
features.  Recent  work  (Lovett  &  Schunn,  1999;  Schunn  &  Klahr,  2000)  suggests  that  even  in 
relatively  simple  tasks,  people  may  be  constantly  changing  which  objects  and  which  object 
features  they  represent,  especially  when  the  task  is  currently  difficult.  That  is,  people  are 
constantly  searching  for  new  things  to  include  and  exclude  from  their  representation  of  the  current 
task.  These  representation  changes  can  have  profound  effects  on  what  people  can  learn  from 
their  experiences  (Lovett  &  Schunn,  1999). 